module: add clearCache for CJS and ESM#61767
Conversation
|
Review requested:
|
90303e6 to
1d0accc
Compare
|
The
notable-change
Please suggest a text for the release notes if you'd like to include a more detailed summary, then proceed to update the PR description with the text or a link to the notable change suggested text comment. Otherwise, the commit will be placed in the Other Notable Changes section. |
|
I’m relatively +1 on having this in Node.js, but I recall having a a lot of discussions about this @GeoffreyBooth and @nodejs/loaders teams about this, and it would massively break the spec, expectations, and invariants regarding ESM. (Note, this is what people have been asking us to add for a long time). My personal objection to this API is that it would inadvertently leak memory at every turn, so while this sounds good in theory, in practice it would significantly backfire in long-running scenarios. An option could be to expose it only behind a flag, putting the user in charge of choosing this behavior. Every single scenario where I saw HMR in Node.js ends up in memory leaks. This is the reason why I had so much interest and hopes for ShadowRealm. |
benjamingr
left a comment
There was a problem hiding this comment.
I am still +1 on the feature from a user usability point of view. Code lgtm.
We're giving users a tool, it may be seen as a footgun by some but hopefully libraries that use the API correctly and warn users about incorrect usage emerge. |
|
@mcollina Thanks for the feedback. I agree the ESM semantics concerns are real. This API doesn’t change the core ESM invariants (single instance per URL); it only removes Node's internal cache entries to allow explicit reloads in opt‑in workflows. Even with that, existing references (namespaces, listeners, closures) can keep old graphs alive, so this is still potentially leaky unless the app does explicit disposal. I’ll make sure the docs call out the risks and the fact that this only clears Node’s internal caches, and I’d like loader team input on the final shape of the API. This commit should address some of your concerns. b3bd79a
Thanks for the review @benjamingr. Would you mind re-reviewing again so I can trigger CI? |
|
Thanks a lot for this ❤️ |
|
Rather than violating ESM invariants, can't node just provide a function that imports a url? i.e. While the given example of: const url = new URL('./mod.mjs', import.meta.url);
await import(url.href);
clearCache(url);
await import(url.href); // re-executes the moduleis indeed not spec compliant, it's perfectly legal to have something like: import { clearCache, importModule } from "node:module";
await importModule(someUrl);
clearCache();
await importModule(someUrl); // reexecute |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #61767 +/- ##
==========================================
- Coverage 89.72% 89.66% -0.06%
==========================================
Files 676 677 +1
Lines 206065 206856 +791
Branches 39508 39619 +111
==========================================
+ Hits 184897 185484 +587
- Misses 13315 13502 +187
- Partials 7853 7870 +17
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
While I am +1 to the idea in general, I am afraid the current API may bring more problem than it solves...see the comments.
(Granted it isn't really a problem unique to this specific design, I think the issue is more that this is not a very well solved problem so far, I don't really know what it should look like, though I think I might be able to point out what it should not look like to avoid adding/re-introducing leaks/use-after-frees that user land workarounds can already manage)
|
I was the one requesting this while sitting next to yagiz today. We take advantage of Module Federation which allows us to distribute code at runtime. However, when parts of the distributed system are updated, it gets stuck in module cache. I've had some workarounds, like attempting to purge require cache - however when it comes to esm, it's a difficult problem. Since we do this distribution primarily in production, and there can be thousands of updates a day, I block esm from being supported because it'll leak memory - which was fine for several years but becoming more problematic in modern tooling. On lambda we cannot just exit a process and bring a new one up without triggering a empty deploy, which has generally been a perf hit to cold start a new lambda vs try and "reset" the module cache for primitive hot reload. Now, I know this might be controversial, or not recommended - but the reality is that many large companies use federation, most fortune 50 companies use it heavily. All of them are relying on userland cobbling I've created. If there is a solution, it would be greatly appreciated by all of my users. I believe this would also be very useful in general for tooling like rspack etc where we have universal dev serves. If invalidation of specific modules causes complexity, I'd be more than happy with a nuclear option like resetModuleCache() which just clears everything entirely. Would be a little slower, but nothing is slower than killing a process and bringing up a new one. "Soft Restart" node without killing it. Don't have much opinion on spec compliance etc, can go through NAPI as well if that would avoid any spec concerns or pushback. |
|
Chiming in to say that re-loading a module is very helpful in tests. We can do this with the fabulous CJS paradigm, but ESM does not have a viable equivalent and it should. |
|
I think there are still quite a few places that need updates/tests - I tried my best to find them, but there are some dusty corners in the module loader that I have never poked at, you might want to take a heap snapshot or write more tests with
|
|
I think I addressed all of your concerns @joyeecheung. Let me know if I missed anything! |
Just pinging @guybedford to speak on the spec concerns. I think we should wait for him or someone similarly knowledgeable about the spec to comment before landing. In general I'm +1 on the feature, assuming it can be safely implemented. My (dim) recollection was that the last time we considered it, it was impossible to modify an ES module after it had been loaded into V8. Has that changed in recent years? How do you handle cases like |
d4fb1b4 to
7e9a7c5
Compare
|
I think one way to test this more robustly (i.e. V8 can actually garbage collect it) might be something like this: // Flags: --expose-internals
const { internalBinding } = require('internal/test/binding');
const { ModuleWrap } = internalBinding('module_wrap');
const { queryObjects } require('node:v8'); // Let's run the test in CJS to reduce the noise from queryObject
let app, rev = 0;
const reload = async () => {
const prev = rev ? `./app.mjs?hmr=${rev}` : null;
if (prev) {
module.clearCache(prev, {
parentURL: import.meta.url,
resolver: "import",
caches: "all",
});
}
app = await import(`./app.mjs?hmr=${++rev}`);
};
(async() {
await reload(); // first load
await reload(); // second
const result = queryObjects(ModuleWrap, { format: 'summary' });
// Validate that result no longer includes module with a wrap whose .url includes `app.mjs?hmr=0`
})();(Or use checkIfCollectableByCounting with ModuleWrap) |
|
@joyeecheung Pushed a new test according to your recommendations. |
|
@joyeecheung @mcollina would you mind re-reviewing? |
|
@guybedford @joyeecheung @GeoffreyBooth I'd like to land this on Monday, but I want to make sure we are all aligned with this change. Would you mind reviewing or leave a comment about your thoughts? Just don't forget that this is an "active development" API which we can iterate over time. |
|
Just wanted to chime in that since going full ESM a few years ago, this has been one of the biggest things I've been missing over CJS, for live reloading NodeJS code for local development based workflows (been using Worker Threads as a fallback). Very much looking forward to this one! 💚 |
test/module-hooks/test-module-hooks-clear-cache-resolve-cache.js
Outdated
Show resolved
Hide resolved
GeoffreyBooth
left a comment
There was a problem hiding this comment.
@guybedford @joyeecheung @GeoffreyBooth I'd like to land this on Monday, but I want to make sure we are all aligned with this change. Would you mind reviewing or leave a comment about your thoughts? Just don't forget that this is an "active development" API which we can iterate over time.
I defer to @joyeecheung on the technical review and to @guybedford on the spec compliance, so if both of them approve then it seems good to me!
One thing I'm wondering about is what about people using this in production. It's obviously designed only for development, but nothing stops someone from using this API anywhere, or for dependencies calling it; is that a concern? Could it be a security concern if a malicious dependency calls this in production? I assume not since what this does is basically already possible in CommonJS, but it might be something worth addressing if only in a comment.
I think according to the threat model this can't be a security concern for Node.js per-se - Node.js trusts all code it's given. Although if the user framework has additional security guarantees they'll need to be responsible in making sure the API is used correctly. Although considering this is a pretty early design and we don't know for sure yet if this is good enough for HMR solution in the wild, it may be a good idea to emit an experimental warning for now and wait for some feedback from HMR solutions actually integrating it with current release to validate the API is not flawed, before removing that warning. Otherwise there's a risk that if this does not work correctly for some packages and requires breaking changes to fix, it can get rather complicated to break if other popular packages start to rely on the design as-is. |
Yeah the production use-cases here are really iffy to me. It seems like people want to use this to work around Lambda cold starts which seems bonkers and there are a million other correct ways to solve that problem. Unit tests should be using For hot reloading you need a more complete solution. Though this PR would fix the memory leak issue. I'm not sure if the concerns raised in #50618 are still relevant but I received some pushback on the specification violations in 2023. |
I am unfamiliar. Can you elaborate? |
Not sure about vm, I remember not being able to get it to work for our use case. But this is what we've been using for hot reloading (with memory leak): Don't get me wrong. I would have loved this pr many years ago but the pushback has always been that this was already possible in userland and not needed. So I'm curious what use cases this pr solves that can't already be done through something discussed here: #49442 Or is this pr just to make those uses cases easier to implement? |
From my side, on Module Federation setups its fairly relevant, regardless the environment. In Module Federation systems, remotes and manifests can change frequently while the same Node process stays warm. The userland options we have today ( That maps directly to what we do in Zephyr as well, where thanks to our system we can take advantage of controlling the serving end, and being able to enable HMR programatically on any URL that we control. Imagine a URL such as |
Stale module graphs seem like a huge concern. Even with this PR existing references to module namespaces cannot be invalidated. In these kinds of deployments are you just doing |
Nah, in Module Federation setups we're not scattering We keep dynamic loading at the federation boundary via Live namespace/object references can’t be revoked once handed out. So we treat updates as cutovers: new requests go to the new graph, in-flight work drains, then old graph cleanup/dispose runs. Sorry if its a bit off-topic 😄 |
I know what the |
A unit test environment should be using ephemeral realms. Otherwise shared global state would cause tests to interfere with each other. Module graphs created via
I see. It's not the deployment model I would choose in an ideal world but I can see how a team would end up here. I think the stakeholders need to make a call on whether or not encouraging the specification violation is important. This is super bike sheddy but I want to point out that the term Arguably hot reloaders are a significant enough departure from a classic module graph that they should be implemented directly via |
My posts throughout this thread have not had any test frameworks in mind. When we consider the operation of |
Introduce Module.clearCache() to invalidate CommonJS and ESM module caches with optional resolution context, enabling HMR-like reloads. Document the API and add tests/fixtures to cover cache invalidation behavior.